Target Speech Extraction: Independent Vector Extraction Guided by Supervised Speaker Identification

Abstract

This manuscript proposes a novel, robust procedure for extracting a speaker of interest (SOI) from a mixture of audio sources. The SOI is estimated via independent vector extraction (IVE). Since blind IVE cannot distinguish the target source by itself, it is guided towards the SOI by frame-wise speaker identification based on deep learning. Still, an incorrect speaker can be extracted due to guidance failings, especially when processing challenging data. To identify such cases, we propose a criterion for non-intrusively assessing the estimated speaker. It utilizes the same model as the speaker identification, so no additional training is required. When an incorrect extraction is detected, we propose a “deflation” step in which the incorrect source is subtracted from the mixture and, subsequently, another attempt to extract the SOI is performed. The process is repeated until a successful extraction is achieved. The proposed procedure is experimentally tested on artificial and real-world datasets containing challenging phenomena: source movements, reverberation, transient noise, or microphone failures. The method is compared with state-of-the-art blind algorithms as well as with current fully supervised deep learning-based methods.
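The extract, assess, and deflate loop described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: `dominant_component` (a rank-1 SVD approximation) merely stands in for guided IVE, `make_criterion` (correlation with a known template) stands in for the speaker-identification-based criterion and is therefore not non-intrusive, and the function names, threshold, and toy signals are hypothetical.

```python
import numpy as np


def dominant_component(x):
    """Hypothetical stand-in for guided IVE: rank-1 image of the strongest source."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return s[0] * np.outer(u[:, 0], vt[0])


def make_criterion(template, threshold=0.5):
    """Hypothetical stand-in for the identification-based acceptance criterion."""
    t = template / np.linalg.norm(template)

    def is_target(estimate):
        # Best normalized correlation between any channel and the template.
        norms = np.maximum(np.linalg.norm(estimate, axis=1), 1e-12)
        return np.max(np.abs(estimate @ t) / norms) > threshold

    return is_target


def extract_with_deflation(mixture, extract, is_target, max_attempts=3):
    """Extract, assess, and on rejection subtract the estimate and retry."""
    residual = np.asarray(mixture, dtype=float).copy()
    for attempt in range(1, max_attempts + 1):
        estimate = extract(residual)
        if is_target(estimate):
            return estimate, attempt
        residual = residual - estimate  # "deflation" of the rejected source
    return None, max_attempts


# Toy two-channel mixture: a loud interferer plus a quieter target.
s_interferer = 3.0 * np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
s_target = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
a_interferer = np.array([1.0, 1.0]) / np.sqrt(2)
a_target = np.array([1.0, -1.0]) / np.sqrt(2)
x = np.outer(a_interferer, s_interferer) + np.outer(a_target, s_target)

estimate, attempts = extract_with_deflation(
    x, dominant_component, make_criterion(s_target)
)
```

In this toy setup the first attempt picks up the louder interferer, the criterion rejects it, and the target is recovered from the deflated mixture on the second attempt. Real speech mixtures are convolutive and not orthogonal, so this shows only the control flow, not the separation quality of the actual method.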

Similar resources

Research on Domain-independent Opinion Target Extraction

Opinion Target Extraction is one of the important tasks in text sentiment analysis and has attracted much attention from researchers. For this task, we propose an M-Score algorithm, used within a model that performs domain-independent opinion target extraction. This algorithm is derived from the Pointwise Mutual Information algorithm, but the difference is that it doesn...

Speech feature extraction using independent component analysis

In this paper, we propose new speech features obtained by applying independent component analysis to human speech. When independent component analysis is applied to speech signals for efficient encoding, the adapted basis functions resemble Gabor-like features. The trained basis functions have some redundancies, so we select a subset of them using a reordering method. The basis functions are almost ordere...

Supervised Opinion Aspect Extraction by Exploiting Past Extraction Results

One of the key tasks of sentiment analysis of product reviews is to extract product aspects or features that users have expressed opinions on. In this work, we focus on using supervised sequence labeling as the base approach to performing the task. Although several extraction methods using sequence labeling methods such as Conditional Random Fields (CRF) and Hidden Markov Models (HMM) have been...

Invariant-integration method for robust feature extraction in speaker-independent speech recognition

The vocal tract length (VTL) is one of the variabilities that speaker-independent automatic speech recognition (ASR) systems encounter. Standard methods to compensate for the effects of different VTLs within the processing stages of ASR systems often require a high computational effort. By using an appropriate warping scheme for the frequency centers of the time-frequency analysis, a change in ...

Non-Linear I-vector Extraction for Speaker Recognition

We propose an algorithm for non-linear i-vector extraction. The algorithm is based on the manifold learning technique named Diffusion Maps (DM) and motivated by recent results that showed that the GMM supervectors reside on a low dimensional manifold. Our proposed method may further be processed using standard techniques such as Linear Discriminant Analysis (LDA), Within Class Covariance Normal...

Journal

Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Year: 2022

ISSN: 2329-9304, 2329-9290

DOI: https://doi.org/10.1109/taslp.2022.3190739